
State of Health Estimation of Batteries Using a Time-Informed Dynamic Sequence-Inverted Transformer

Patel, Janak M., Ramezankhani, Milad, Deodhar, Anirudh, Birru, Dagnachew

arXiv.org Artificial Intelligence

The rapid adoption of battery-powered vehicles and energy storage systems over the past decade has made battery health monitoring increasingly critical. Batteries play a central role in the efficiency and safety of these systems, yet they inevitably degrade over time due to repeated charge-discharge cycles. This degradation leads to reduced energy efficiency and potential overheating, posing significant safety concerns. Accurate estimation of a battery's State of Health (SoH) is therefore essential for ensuring operational reliability and safety. Several machine learning architectures, such as LSTMs, transformers, and encoder-based models, have been proposed to estimate SoH from discharge cycle data. However, these models struggle with the irregularities inherent in real-world measurements: discharge readings are often recorded at non-uniform intervals, and the lengths of discharge cycles vary significantly. To address this, most existing approaches extract features from the sequences rather than processing them in full, which introduces information loss and compromises accuracy. To overcome these challenges, we propose a novel architecture: Time-Informed Dynamic Sequence Inverted Transformer (TIDSIT). TIDSIT incorporates continuous time embeddings to effectively represent irregularly sampled data and utilizes padded sequences with temporal attention mechanisms to manage variable-length inputs without discarding sequence information. Experimental results on the NASA battery degradation dataset show that TIDSIT significantly outperforms existing models, achieving a reduction of over 50% in prediction error and maintaining an SoH prediction error below 0.58%. Furthermore, the architecture is generalizable and holds promise for broader applications in health monitoring tasks involving irregular time-series data.
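The continuous time embedding idea can be sketched in a few lines. The sinusoidal form, function name, and dimensions below are illustrative assumptions, not the paper's exact layer; the point is only that each irregular timestamp maps to a fixed-size vector an attention model can consume.

```python
import numpy as np

def continuous_time_embedding(t, dim=8, max_period=1e4):
    """Map an array of (possibly irregular) timestamps to sinusoidal
    embeddings, so absolute gaps between readings are visible to an
    attention model. Sinusoidal form is an assumption for illustration."""
    freqs = np.exp(-np.log(max_period) * np.arange(dim // 2) / (dim // 2))
    angles = np.asarray(t, dtype=float)[:, None] * freqs[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

# Irregularly sampled discharge readings: gaps of 1 s, 3 s, and 10 s.
ts = np.array([0.0, 1.0, 4.0, 14.0])
emb = continuous_time_embedding(ts, dim=8)
print(emb.shape)  # (4, 8)
```

Unlike integer positional indices, this embedding distinguishes a 1-second gap from a 10-second gap between consecutive readings.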


DETNO: A Diffusion-Enhanced Transformer Neural Operator for Long-Term Traffic Forecasting

Ahmad, Owais, Ramezankhani, Milad, Deodhar, Anirudh

arXiv.org Artificial Intelligence

Accurate long-term traffic forecasting remains a critical challenge in intelligent transportation systems, particularly when predicting high-frequency traffic phenomena such as shock waves and congestion boundaries over extended rollout horizons. Neural operators have recently gained attention as promising tools for modeling traffic flow. While effective at learning function space mappings, they inherently produce smooth predictions that fail to reconstruct high-frequency features such as sharp density gradients which results in rapid error accumulation during multi-step rollout predictions essential for real-time traffic management. To address these fundamental limitations, we introduce a unified Diffusion-Enhanced Transformer Neural Operator (DETNO) architecture. DETNO leverages a transformer neural operator with cross-attention mechanisms, providing model expressivity and super-resolution, coupled with a diffusion-based refinement component that iteratively reconstructs high-frequency traffic details through progressive denoising. This overcomes the inherent smoothing limitations and rollout instability of standard neural operators. Through comprehensive evaluation on chaotic traffic datasets, our method demonstrates superior performance in extended rollout predictions compared to traditional and transformer-based neural operators, preserving high-frequency components and improving stability over long prediction horizons.
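The refinement component can be pictured as a generic reverse-diffusion loop. This sketch uses a DDIM-style deterministic update with a stand-in denoiser; the schedule, parameterization, and function names are assumptions for illustration, not DETNO's actual trained component.

```python
import numpy as np

def progressive_denoise(x0_est_fn, shape, steps=100, beta=0.02, seed=0):
    """Generic diffusion-style reverse loop. `x0_est_fn(x_t, t)` stands in
    for a trained denoiser predicting the clean signal; the linear beta
    schedule is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    alphas = np.cumprod(np.full(steps, 1.0 - beta))
    x = rng.standard_normal(shape)                 # start from pure noise
    for t in range(steps - 1, 0, -1):
        x0_hat = x0_est_fn(x, t)                   # predicted clean signal
        a_t, a_prev = alphas[t], alphas[t - 1]
        # DDIM-style deterministic step toward the predicted clean signal
        eps = (x - np.sqrt(a_t) * x0_hat) / np.sqrt(1.0 - a_t)
        x = np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * eps
    return x0_est_fn(x, 0)

# Toy "denoiser": pull toward a sharp square wave of the kind a smooth
# neural operator tends to blur out.
target = np.sign(np.sin(np.linspace(0, 4 * np.pi, 64)))
refined = progressive_denoise(lambda x, t: target, target.shape)
```

In the paper's setting, the denoiser would be conditioned on the neural operator's smooth prediction, so the loop restores high-frequency detail rather than generating from scratch.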


Latent Representations of Intracardiac Electrograms for Atrial Fibrillation Driver Detection

Peiro-Corbacho, Pablo, Lin, Long, Ávila, Pablo, Carta-Bergaz, Alejandro, Arenal, Ángel, Sevilla-Salcedo, Carlos, Ríos-Muñoz, Gonzalo R.

arXiv.org Artificial Intelligence

Atrial Fibrillation (AF) is the most prevalent sustained arrhythmia, yet current ablation therapies, including pulmonary vein isolation, are frequently ineffective in persistent AF due to the involvement of non-pulmonary vein drivers. This study proposes a deep learning framework using convolutional autoencoders for unsupervised feature extraction from unipolar and bipolar intracavitary electrograms (EGMs) recorded during AF in ablation studies. These latent representations of atrial electrical activity enable the characterization and automation of EGM analysis, facilitating the detection of AF drivers. The database consisted of 11,404 acquisitions recorded from 291 patients, containing 228,080 unipolar EGMs and 171,060 bipolar EGMs. The autoencoders successfully learned latent representations with low reconstruction loss, preserving the morphological features. The extracted embeddings allowed downstream classifiers to detect rotational and focal activity with moderate performance (AUC 0.73-0.76). This work highlights the potential of unsupervised learning to uncover physiologically meaningful features from intracardiac signals. Introduction: Atrial Fibrillation (AF) is the most common sustained cardiac arrhythmia in adults, affecting an estimated 59 million people around the world in 2019 [1]. It is defined as a supraventricular tachyarrhythmia characterized by disorganized electrical activity of the atrium and ineffective atrial contraction [2]. As life expectancy increases worldwide, the prevalence of AF is expected to rise accordingly [3]. Although some patients may be asymptomatic, many experience symptoms such as palpitations, fatigue, and dyspnea.
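The encode-to-latent-to-classify pipeline can be illustrated with the closed-form optimum of a *linear* autoencoder (truncated SVD / PCA), used here only as a stand-in for the paper's convolutional autoencoder; the data, latent size, and variable names are toy assumptions.

```python
import numpy as np

# A linear autoencoder's optimal weights are given by truncated SVD (PCA);
# we use it as a stand-in for a convolutional autoencoder to show the
# encode -> latent -> downstream-classifier pipeline on toy data.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 64))   # 200 toy "EGM" segments, 64 samples each
X -= X.mean(axis=0)                  # center before SVD
U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 8                                # latent dimension (illustrative choice)
encode = lambda x: x @ Vt[:k].T      # 64-sample segment -> 8-dim embedding
decode = lambda z: z @ Vt[:k]        # embedding -> reconstructed segment
Z = encode(X)                        # embeddings fed to downstream classifiers
recon_err = np.mean((X - decode(Z)) ** 2)
print(Z.shape)  # (200, 8)
```

A convolutional encoder replaces the single linear projection with stacked convolutions, but the role of the bottleneck `Z` as the classifier input is the same.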


A Denoising VAE for Intracardiac Time Series in Ischemic Cardiomyopathy

Ruipérez-Campillo, Samuel, Ryser, Alain, Sutter, Thomas M., Feng, Ruibin, Ganesan, Prasanth, Deb, Brototo, Brennan, Kelly A., Pedron, Maxime, Rogers, Albert J., Kolk, Maarten Z. H., Tjong, Fleur V. Y., Narayan, Sanjiv M., Vogt, Julia E.

arXiv.org Artificial Intelligence

In the field of cardiac electrophysiology (EP), effectively reducing noise in intra-cardiac signals is crucial for the accurate diagnosis and treatment of arrhythmias and cardiomyopathies. However, traditional noise reduction techniques fall short in addressing the diverse noise patterns from various sources, often non-linear and non-stationary, present in these signals. This work introduces a Variational Autoencoder (VAE) model, aimed at improving the quality of intra-ventricular monophasic action potential (MAP) signal recordings. By constructing representations of clean signals from a dataset of 5706 time series from 42 patients diagnosed with ischemic cardiomyopathy, our approach demonstrates superior denoising performance when compared to conventional filtering methods commonly employed in clinical settings. We assess the effectiveness of our VAE model using various metrics, indicating its superior capability to denoise signals across different noise types, including time-varying non-linear noise frequently found in clinical settings. These results reveal that VAEs can eliminate diverse sources of noise in single beats, outperforming state-of-the-art denoising techniques and potentially improving treatment efficacy in cardiac EP.
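Two mechanics shared by any VAE, including a denoising one, are the reparameterization trick and the closed-form KL term of the loss. This is a minimal numpy sketch of those two pieces only; it is not the paper's specific encoder or decoder.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Reparameterization trick: sample z = mu + sigma * eps, keeping the
    sampling step differentiable with respect to mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(q(z|x) || N(0, I)) term of the VAE objective."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

rng = np.random.default_rng(0)
mu, log_var = np.zeros(4), np.zeros(4)   # encoder output q(z|x) = N(0, I)
z = reparameterize(mu, log_var, rng)
print(z.shape, kl_to_standard_normal(mu, log_var))  # (4,) 0.0
```

In the denoising setup, the reconstruction term of the loss compares the decoder output against the *clean* MAP signal while the encoder sees the noisy one.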


GITO: Graph-Informed Transformer Operator for Learning Complex Partial Differential Equations

Ramezankhani, Milad, Patel, Janak M., Deodhar, Anirudh, Birru, Dagnachew

arXiv.org Artificial Intelligence

We present a novel graph-informed transformer operator (GITO) architecture for learning complex partial differential equation systems defined on irregular geometries and non-uniform meshes. GITO consists of two main modules: a hybrid graph transformer (HGT) and a transformer neural operator (TNO). HGT leverages a graph neural network (GNN) to encode local spatial relationships and a transformer to capture long-range dependencies. A self-attention fusion layer integrates the outputs of the GNN and transformer to enable more expressive feature learning on graph-structured data. The TNO module employs linear-complexity cross-attention and self-attention layers to map encoded input functions to predictions at arbitrary query locations, ensuring discretization invariance and enabling zero-shot super-resolution across any mesh. Empirical results on benchmark PDE tasks demonstrate that GITO outperforms existing transformer-based neural operators, paving the way for efficient, mesh-agnostic surrogate solvers in engineering applications.
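The fusion-layer idea, combining GNN and transformer features via attention, can be sketched with plain scaled dot-product attention. The choice of queries from the GNN stream and keys/values from the stacked streams, along with all shapes and weight names, are illustrative assumptions rather than GITO's exact layer.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fusion(h_gnn, h_tr, Wq, Wk, Wv):
    """Fuse per-node GNN features with transformer features via scaled
    dot-product attention: queries from the GNN stream, keys/values from
    both streams stacked as one token set (an illustrative design)."""
    H = np.concatenate([h_gnn, h_tr], axis=0)     # (2N, d) token stack
    Q, K, V = h_gnn @ Wq, H @ Wk, H @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))   # (N, 2N) attention weights
    return A @ V                                  # (N, d) fused features

rng = np.random.default_rng(0)
N, d = 5, 8
fused = attention_fusion(rng.standard_normal((N, d)), rng.standard_normal((N, d)),
                         *(rng.standard_normal((d, d)) for _ in range(3)))
print(fused.shape)  # (5, 8)
```

Each node's fused feature is a learned mixture of local (GNN) and long-range (transformer) information, which is what makes the combined representation more expressive than either stream alone.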


Marker Track: Accurate Fiducial Marker Tracking for Evaluation of Residual Motions During Breath-Hold Radiotherapy

Guo, Aimee, Mao, Weihua

arXiv.org Artificial Intelligence

Fiducial marker positions in projection image of cone-beam computed tomography (CBCT) scans have been studied to evaluate daily residual motion during breath-hold radiation therapy. Fiducial marker migration posed challenges in accurately locating markers, prompting the development of a novel algorithm that reconstructs volumetric probability maps of marker locations from filtered gradient maps of projections. This guides the development of a Python-based algorithm to detect fiducial markers in projection images using Meta AI's Segment Anything Model 2 (SAM 2). Retrospective data from a pancreatic cancer patient with two fiducial markers were analyzed. The three-dimensional (3D) marker positions from simulation computed tomography (CT) were compared to those reconstructed from CBCT images, revealing a decrease in relative distances between markers over time. Fiducial markers were successfully detected in 2777 out of 2786 projection frames. The average standard deviation of superior-inferior (SI) marker positions was 0.56 mm per breath-hold, with differences in average SI positions between two breath-holds in the same scan reaching up to 5.2 mm, and a gap of up to 7.3 mm between the end of the first and beginning of the second breath-hold. 3D marker positions were calculated using projection positions and confirmed marker migration. This method effectively calculates marker probability volume and enables accurate fiducial marker tracking during treatment without requiring any specialized equipment, additional radiation doses, or manual initialization and labeling. It has significant potential for automatically assessing daily residual motion to adjust planning margins, functioning as an adaptive radiation therapy tool.


FB-HyDON: Parameter-Efficient Physics-Informed Operator Learning of Complex PDEs via Hypernetwork and Finite Basis Domain Decomposition

Ramezankhani, Milad, Parekh, Rishi Yash, Deodhar, Anirudh, Birru, Dagnachew

arXiv.org Artificial Intelligence

Partial differential equations (PDEs) are integral in modeling and describing the dynamics of many complex systems in science and engineering. Numerical solvers such as finite element methods (FEMs) and finite difference methods (FDMs) often obtain the solution of PDEs by discretizing the domain and solving a finite-dimensional problem. However, obtaining high-resolution solutions to PDEs using numerical simulations for complex large-scale problems can be computationally expensive and prohibitive. There has been a growing interest in more efficient data-driven alternatives that can directly learn the underlying solutions from the available data without requiring explicit knowledge about the governing PDEs [3, 11]. More recently, operator learning has emerged as a promising paradigm, aiming to learn an unknown mathematical operator governing a system of PDEs [4]. These operators capture mappings between infinite-dimensional function spaces and have demonstrated potential in capturing complex solution behaviors [18, 15]. Furthermore, due to their inherent differentiability, they can be seamlessly applied to inverse problems, such as design optimization tasks [1]. Various architectures have been developed, including the Deep Neural Operator (DeepONet) [18], Fourier Neural Operator (FNO) [15], Graph Neural Operator [16], General Neural Operator Transformer (GNOT) [9] and Operator Transformer (OFormer) [14]. These models differ in their discretization methods and the approximation techniques they use to enhance efficiency and scalability.
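The DeepONet family that FB-HyDON builds on has a simple forward pass: a branch net encodes the input function sampled at fixed sensors, a trunk net encodes a query location, and their inner product approximates the operator output G(u)(y). The one-layer "networks" below are random stand-ins, not trained models.

```python
import numpy as np

def deeponet_forward(branch, trunk, u_sensors, y_query):
    """DeepONet evaluates G(u)(y) ~= <branch(u), trunk(y)>: the branch net
    maps the input function at fixed sensor points to p coefficients, the
    trunk net maps the query point to p basis values."""
    b = branch(u_sensors)                 # (p,) coefficients from u
    t = trunk(y_query)                    # (p,) basis values at y
    return b @ t                          # scalar prediction G(u)(y)

rng = np.random.default_rng(0)
W_b = rng.standard_normal((32, 16))       # toy one-layer "branch net"
W_t = rng.standard_normal((1, 16))        # toy one-layer "trunk net"
branch = lambda u: np.tanh(u @ W_b)
trunk = lambda y: np.tanh(np.atleast_1d(y) @ W_t)
u = np.sin(np.linspace(0, np.pi, 32))     # input function at 32 sensors
out = deeponet_forward(branch, trunk, u, np.array([0.5]))
```

In FB-HyDON, a hypernetwork generates such network weights per subdomain of a finite-basis domain decomposition, rather than learning one fixed set globally.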


Fair Evaluation of Federated Learning Algorithms for Automated Breast Density Classification: The Results of the 2022 ACR-NCI-NVIDIA Federated Learning Challenge

Schmidt, Kendall, Bearce, Benjamin, Chang, Ken, Coombs, Laura, Farahani, Keyvan, Elbatele, Marawan, Mouhebe, Kaouther, Marti, Robert, Zhang, Ruipeng, Zhang, Yao, Wang, Yanfeng, Hu, Yaojun, Ying, Haochao, Xu, Yuyang, Testagrose, Conrad, Demirer, Mutlu, Gupta, Vikash, Akünal, Ünal, Bujotzek, Markus, Maier-Hein, Klaus H., Qin, Yi, Li, Xiaomeng, Kalpathy-Cramer, Jayashree, Roth, Holger R.

arXiv.org Artificial Intelligence

The correct interpretation of breast density is important in the assessment of breast cancer risk. AI has been shown capable of accurately predicting breast density, however, due to the differences in imaging characteristics across mammography systems, models built using data from one system do not generalize well to other systems. Though federated learning (FL) has emerged as a way to improve the generalizability of AI without the need to share data, the best way to preserve features from all training data during FL is an active area of research. To explore FL methodology, the breast density classification FL challenge was hosted in partnership with the American College of Radiology, Harvard Medical School's Mass General Brigham, University of Colorado, NVIDIA, and the National Institutes of Health National Cancer Institute. Challenge participants were able to submit docker containers capable of implementing FL on three simulated medical facilities, each containing a unique large mammography dataset. The breast density FL challenge ran from June 15 to September 5, 2022, attracting seven finalists from around the world. The winning FL submission reached a linear kappa score of 0.653 on the challenge test data and 0.413 on an external testing dataset, scoring comparably to a model trained on the same data in a central location.
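The baseline aggregation step in most FL setups is federated averaging (FedAvg, McMahan et al.): the server averages client weights, weighted by local dataset size. This sketch shows that step only; the winning challenge submissions may use different aggregation strategies.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Federated averaging: combine per-client model weights into a global
    model, weighting each client by its local dataset size."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()
    # Average layer-by-layer across clients.
    return [sum(c * w[i] for c, w in zip(coeffs, client_weights))
            for i in range(len(client_weights[0]))]

# Three simulated facilities, each holding one toy weight matrix.
clients = [[np.full((2, 2), v)] for v in (1.0, 2.0, 4.0)]
global_w = fedavg(clients, client_sizes=[100, 100, 200])
print(global_w[0])  # each entry = (100*1 + 100*2 + 200*4) / 400 = 2.75
```

Size-weighted averaging keeps a facility with a large mammography dataset from being drowned out by many small ones, but, as the challenge explores, it is not the only way to preserve features from all training data.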


Optimizing Performance of Feedforward and Convolutional Neural Networks through Dynamic Activation Functions

Rane, Chinmay, Tyagi, Kanishka, Manry, Michael

arXiv.org Artificial Intelligence

Deep learning training algorithms have seen huge success in recent years across many fields, including speech, text, image, and video. Deeper and deeper networks have been proposed with great success, with ResNet architectures reaching around 152 layers. Shallow convolutional neural networks (CNNs) remain an active area of research, where some phenomena are still unexplained. The activation functions used in a network are of utmost importance, as they provide its non-linearity; ReLUs are the most commonly used activation function. We propose a complex piecewise-linear (PWL) activation in the hidden layer and show that these PWL activations work much better than ReLU activations in our networks, for both convolutional neural networks and multilayer perceptrons. Result comparisons in PyTorch for shallow and deep CNNs are given to further strengthen our case.
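One standard way to build a piecewise-linear activation is as a sum of shifted ReLUs; with a single hinge at zero it reduces to a plain ReLU, and extra hinges add expressivity. The hinge locations and slopes below are fixed for illustration; in the paper's setting they would be learned parameters.

```python
import numpy as np

def pwl_activation(x, hinges, slopes, bias=0.0):
    """Piecewise-linear activation as a sum of shifted ReLUs:
    f(x) = bias + sum_i slopes[i] * max(0, x - hinges[i]).
    Hinges and slopes are fixed here; a dynamic variant learns them."""
    x = np.asarray(x, dtype=float)
    out = np.full_like(x, bias)
    for h, s in zip(hinges, slopes):
        out += s * np.maximum(0.0, x - h)
    return out

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
# One hinge at 0 with slope 1 recovers a plain ReLU.
print(pwl_activation(x, hinges=[0.0], slopes=[1.0]))   # [0. 0. 0. 0.5 2.]
# Two hinges give a kinked, more expressive response.
print(pwl_activation(x, hinges=[-1.0, 1.0], slopes=[0.5, 1.0]))
```

Because each segment stays linear, gradients remain simple to compute, which is part of why PWL activations are attractive drop-in replacements for ReLU.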


Problems and shortcuts in deep learning for screening mammography

Tsue, Trevor, Mombourquette, Brent, Taha, Ahmed, Matthews, Thomas Paul, Vu, Yen Nhi Truong, Su, Jason

arXiv.org Artificial Intelligence

This work reveals undiscovered challenges in the performance and generalizability of deep learning models. We (1) identify spurious shortcuts and evaluation issues that can inflate performance and (2) propose training and analysis methods to address them. We trained an AI model to classify cancer on a retrospective dataset of 120,112 US exams (3,467 cancers) acquired from 2008 to 2017 and 16,693 UK exams (5,655 cancers) acquired from 2011 to 2015. We evaluated on a screening mammography test set of 11,593 US exams (102 cancers; 7,594 women; age 57.1 ± 11.0) and 1,880 UK exams (590 cancers; 1,745 women; age 63.3 ± 7.2). A model trained on images of only view markers (no breast) achieved a 0.691 AUC. The original model trained on both datasets achieved a 0.945 AUC on the combined US+UK dataset but paradoxically only 0.838 and 0.892 on the US and UK datasets, respectively. Sampling cancers equally from both datasets during training mitigated this shortcut. A similar AUC paradox (0.903) occurred when evaluating diagnostic exams vs screening exams (0.862 vs 0.861, respectively). Removing diagnostic exams during training alleviated this bias. Finally, the model did not exhibit the AUC paradox over scanner models but still exhibited a bias toward Selenia Dimension (SD) over Hologic Selenia (HS) exams. Analysis showed that this AUC paradox occurred when a dataset attribute had values with a higher cancer prevalence (dataset bias) and the model consequently assigned a higher probability to these attribute values (model bias). Stratification and balancing cancer prevalence can mitigate shortcuts during evaluation. Dataset and model bias can introduce shortcuts and the AUC paradox, potentially pervasive issues within the healthcare AI space. Our methods can verify and mitigate shortcuts while providing a clear understanding of performance.
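The AUC paradox has a compact toy demonstration: if one dataset has higher cancer prevalence *and* uniformly higher model scores, the pooled AUC can exceed the AUC within either dataset, even when the model is useless inside each one. The data below are fabricated solely to exhibit the effect, not drawn from the paper.

```python
import numpy as np

def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney form); ties count as 0.5."""
    pos = scores[labels == 1][:, None]
    neg = scores[labels == 0][None, :]
    return (pos > neg).mean() + 0.5 * (pos == neg).mean()

# Dataset A: high cancer prevalence, model outputs a uniformly high score.
# Dataset B: low prevalence, uniformly low score. Within each dataset the
# model cannot rank cases at all, yet the pooled AUC looks strong.
s_a, y_a = np.full(4, 0.9), np.array([1, 1, 1, 0])
s_b, y_b = np.full(4, 0.1), np.array([1, 0, 0, 0])
print(auc(s_a, y_a), auc(s_b, y_b))            # 0.5 0.5
print(auc(np.r_[s_a, s_b], np.r_[y_a, y_b]))   # 0.75
```

The pooled 0.75 comes entirely from the model distinguishing *datasets* (a shortcut), not cancers, which is why stratified, per-attribute evaluation is needed to expose it.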